(54) Method and modules for transmitting and receiving compressed data



(57) The present invention relates to a method for 
transmitting digital audio and/or video data. The 
invention furthermore relates to a first and a second 
communication device and a first and a second program 
module therefor. 

The method comprises the steps of:

- Dividing digital audio and/or video data (S1, S2) of a first data stream (IDS) into first frames (F1-F5).

- Frame-based encoding of the first frames (F1-F5) into second frames (V1P-V3P; V1-V20) containing
compressed data of the first data stream (IDS).

- Generating a flow of messages of a real time protocol, a majority of the messages containing at least one
first frame (V1) of the second frames and a state information (SI1) needed to decode the first frame (V1).
The state information (SI1) is derivable from a second frame (V1P-V3P), provided that it is available.
The second frame (V1P-V3P) precedes the first frame (V1) and is contained in a message preceding the
message containing the first frame (V1) to be decoded.

- Transmitting the messages from a first (CD; T11; S21; ACC) to a second communication device (CD; T31;
T12).

- Decoding the second frames (V1P-V3P; V1-V20), thereby evaluating the respective state information
(SI1) contained in a respective message to decompress the respective first frame (V1) being
contained in the respective message (M1).
Background of the Invention

[0001] The present invention relates to a method for transmitting digital audio and/or video data. The invention 
furthermore relates to a first and a second communication device therefor, a first program module for a first 
communication device therefor and a second program module for a second communication device therefor. 
[0002] Digital audio and/or video data is usually compressed prior to its transmission in order to save transmission
capacity. The compression is especially advantageous when real time transmission is required. To this end so-called
codecs are used. Codec is an abbreviation for compression/decompression. A codec can be either a software application 
or a hardware device that processes audio/video data through complex algorithms, which compress the data at the sending 
side and decompress the data at the receiving side for playback. 

[0003] The compression and decompression may be performed sample by sample ("sample-based") or, e.g., frame by 
frame ("frame-based"). Each frame is a data block containing a predetermined amount of audio and/or video data, that is 
encoded into another block or frame containing compressed data. A GSM codec, e.g., performs frame-based 
compression/decompression (GSM = Global System for Mobile communication). 

[0004] The frame-based compression/decompression may be performed "state-based/stateful" or "stateless". 
[0005] "Stateless" means that each frame is encoded independently and may be consequently decoded independently 
as well. The encoded frames may be transmitted from a first to a second communication device via a network, for 
example from a voice terminal to a Personal Computer via the Internet. If real time transmission is requested, a real 
time protocol, e.g. the RTP (Real-time Transport Protocol) as recommended by the IETF (Internet Engineering Task Force), 
is typically used. Even if an RTP-packet containing one or more encoded frames out of a stream of RTP-packets is lost, the
frames following the lost frames can basically be decoded without loss or nearly without loss. There is no loss 
propagation effect caused by the lost frames. There is only loss caused by the compression / decompression operation 
and, of course, the lost data of the lost frame(s). 

[0006] "State-based/stateful" means that each frame is encoded using information derived from previous frames. This 
method of compression may look for information that is not necessary for continuity to the human ear or eye. For example 
a change of frame in comparison with a preceding frame is evaluated for the compression. One advantage of state- 
based/stateful compression is its better compression quality and/or efficiency. However, if such state-based/stateful 
encoded frames are transmitted during a real time communication, the quality requirements are relatively high. If for 
example an RTP-packet containing encoded frames is lost, the frames being contained in the RTP-packets following the 
lost RTP-packet cannot be decoded or can only be decoded with significant loss. There is an undesirable propagation
effect that deteriorates substantially the quality of the real time communication. 
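
The difference between stateless and state-based/stateful decoding after a packet loss can be illustrated with a minimal sketch. It assumes a toy codec that is not part of the invention: frames are plain integers, the stateless codec sends each frame unchanged, and the stateful codec sends the difference to the previous frame. All function names are hypothetical.

```python
def encode_stateless(frames):
    # Toy stateless codec: every frame is encoded on its own (sent unchanged).
    return list(frames)

def decode_stateless(packets):
    # A lost packet (None) only affects the frame it carried.
    return list(packets)

def encode_stateful(frames):
    # Toy stateful codec: each frame is sent as the difference to the
    # previous frame; the "state" is the previous frame value.
    prev, out = 0, []
    for f in frames:
        out.append(f - prev)
        prev = f
    return out

def decode_stateful(packets):
    # After a loss the decoder no longer knows the state, so every
    # following frame is undecodable (None) in this toy model.
    prev, out = 0, []
    for p in packets:
        if p is None or prev is None:
            out.append(None)
            prev = None
        else:
            prev = prev + p
            out.append(prev)
    return out

frames = [10, 12, 15, 15, 11]
stateless = encode_stateless(frames)
stateful = encode_stateful(frames)
stateless[1] = None  # the second packet is lost
stateful[1] = None

print(decode_stateless(stateless))  # only the lost frame is missing
print(decode_stateful(stateful))    # the loss propagates to all later frames
```

In this toy model the stateless stream loses one frame, while the stateful stream loses the lost frame and every frame after it, which is the propagation effect described above.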

[0007] In this context it has to be noted that during a non-real time communication, for example a download of video
data from a server, a lost packet may of course be re-requested by the receiving side and re-transmitted by
the sending side. However, the re-transmission causes considerable load at all sides involved, i.e., the re-requesting
receiver, the re-transmitting sender, and the network connecting the sender and the receiver.

Summary of the Invention: 

[0008] Accordingly, one object of the invention is to provide a method, devices and modules capable of an improved
transmission of frames containing state-based/stateful compressed video and/or audio data.

[0009] This object is to be attained by a method in accordance with the technical principle of claim 1, a first and
a second communication device therefor, a first program module for a first communication device therefor and a second
program module for a second communication device therefor, said communication devices and program modules being in
accordance with the technical principles of further claims.

[0010] In this respect, one principle of the invention is that digital audio and/or video data of a first data
stream is divided into first frames. The digital audio data contains, e.g., voice data, music data (for example for an
interactive karaoke) and the like. Subsequently, the first frames are encoded frame-by-frame into second frames. The
second frames contain the data of the first frames as compressed data. It has to be noted that one second frame may
contain compressed data of two or more first frames or of a fraction of a first frame. The second frames are packed into
messages of a real time protocol, thereby generating a flow of messages.

[0011] Each of these messages contains at least one frame of the second frames. Furthermore, each of these
messages, or at least a majority of the messages, contains a state information needed to decode the at least one frame
contained in the respective message. The state information is derived from at least one second frame of the second
frames, provided that it is available. The at least one second frame of the second frames is a frame preceding the at
least one frame to be decoded and is contained in a message preceding the message containing the at least one frame to
be decoded. Even if it is preferred that all messages contain a state information, it may be sufficient that only the
majority of the messages contains a state information. For example, the first message of the stream may not comprise a
state information.

[0012] The first frames, the 'original' frames, are for example encoded according to the formula

$$V_i = E(F_i, S_i)$$

whereby $V_i$ is the second frame generated by an encoding operation $E$, $F_i$ is the (first) frame to be encoded and $S_i$ is a
state information derived from previous frames, e.g. from 3 preceding frames $F_{i-1}$, $F_{i-2}$, $F_{i-3}$ according to, e.g., the formula

$$S_i = S(F_{i-1}, F_{i-2}, F_{i-3})$$

whereby $S$ is the state information evaluation operation evaluating the frames $F_{i-1}$, $F_{i-2}$, $F_{i-3}$. A message according to the
invention contains the second ('compressed') frames and the state information. The message is transmitted from a first
to a second communication device and may have, e.g., a payload like:

$$[\,S_i,\ V_i,\ V_{i+1},\ V_{i+2},\ V_{i+3},\ V_{i+4}\,]$$

whereby $V_i$, $V_{i+1}$, $V_{i+2}$, $V_{i+3}$, $V_{i+4}$ are subsequent second (= encoded) frames and $S_i$ is a state information needed to
decode the first frame $V_i$ of the message. It is clear that the state information $S_i$ may at least partly be used to decode
also one or more of the frames $V_{i+2}$, $V_{i+3}$, $V_{i+4}$. Furthermore, the message may contain more or fewer frames than the frames
$V_{i+1}$, $V_{i+2}$, $V_{i+3}$, $V_{i+4}$.
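
The encoding formula and the payload layout of paragraph [0012] may be sketched as follows. This is only an illustration under stated assumptions: frames are integers, the state evaluation operation is taken to be the integer mean of up to three preceding first frames, and the encoding operation is the residual of the frame against that state. The functions `state_of`, `encode` and `build_messages` are hypothetical and not defined by the specification.

```python
def state_of(prev_frames):
    """S: toy state evaluation over up to three preceding first frames (assumption)."""
    last = prev_frames[-3:]
    return sum(last) // len(last) if last else 0

def encode(frame, state):
    """E: toy encoding operation, the residual of the frame against the state."""
    return frame - state

def build_messages(first_frames, frames_per_message):
    """Pack encoded frames into messages that carry the state of their first frame."""
    messages, history = [], []
    for start in range(0, len(first_frames), frames_per_message):
        chunk = first_frames[start:start + frames_per_message]
        s_first = state_of(history)  # S_i for the first frame V_i of this message
        encoded = []
        for f in chunk:
            encoded.append(encode(f, state_of(history)))  # V = E(F, S)
            history.append(f)
        messages.append({"state": s_first, "frames": encoded})
    return messages

messages = build_messages([10, 12, 14, 16, 18, 20], frames_per_message=3)
for m in messages:
    print(m)
```

Each message carries exactly one state value, namely the one the encoder used for the first frame of that message, which mirrors the payload layout given above.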

[0013] The second communication device decodes, e.g., by means of a program module according to the invention,
the second frames contained in the received messages. Thereby the second communication device evaluates the
respective state information contained in the respective message to decompress the respective at least one first frame that
is contained in the respective message.

[0014] The respective state information is evaluated at least in the case that one or more preceding messages and
the frames contained therein are lost. In such a scenario the receiving second communication device decodes the first
frame $V_i$ of the message for example according to the formula:

$$F'_i = D(V_i, S_i)$$

whereby $D$ is the decoding operation on a second frame $V_i$ evaluating the state information $S_i$, thereby generating a
decoded version $F'_i$ of the original first frame $F_i$. It has to be noted that compression/decompression usually implies
certain loss and/or the state information $S_i$ is to some extent insufficient. Thus, the original first frame $F_i$ cannot
be completely recovered, but only a decoded version $F'_i$ that is to some extent different from the original frame $F_i$.

[0015] If however there were no loss during compression/decompression, the above formula would be

$$F_i = D(V_i, S_i)$$

[0016] The subsequent second (= encoded) frame $V_{i+1}$ may be decoded to a frame $F'_{i+1}$, e.g., according to the formula:

$$F'_{i+1} = D(V_{i+1}, S'_{i+1})$$

whereby $S'_{i+1}$ is an estimate of a state information based on, e.g., the available previous encoded frame $V_i$ and/or further
previous frames, e.g., $V_{i-1}$, $V_{i-2}$, and/or the available already decoded frame $F'_i$ and/or further previous decoded frames,
e.g., $F'_{i-1}$, $F'_{i-2}$, for example:

$$S'_{i+1} = ES_a(V_i)$$

$$S'_{i+1} = ES_a(V_i, V_{i-1}, V_{i-2})$$

$$S'_{i+1} = ES_a(V_i, F'_i)$$

$$S'_{i+1} = ES_a(F'_i)$$

whereby $ES_a$ is an estimate function evaluating the encoded frames $V_i$, $V_{i-1}$ and/or the already decoded frames $F'_i$,
$F'_{i-1}$, $F'_{i-2}$, thereby determining the state information $S'_{i+1}$. The estimate function may optionally also evaluate the state
information $S_i$, for example:

$$S'_{i+1} = ES_a(F'_i, S_i)$$

[0017] The further subsequent second (= encoded) frames $V_{i+2}$, $V_{i+3}$ may be decoded accordingly, whereby a further
estimate function $ES_b$ may evaluate more preceding frames.

[0018] The further frames $V_{i+2}$, $V_{i+3}$ are accordingly decoded.
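
The decoding of a message whose predecessors are lost may be sketched as follows. The sketch assumes a toy codec in which the encoder subtracts a state value from each integer frame, so that decoding adds the state back; the estimate function is assumed to be the integer mean of up to three already decoded frames. The names `decode`, `estimate_state` and `decode_message` are hypothetical.

```python
def decode(v, state):
    """D: inverse of the assumed toy encoding E(F, S) = F - S."""
    return v + state

def estimate_state(decoded):
    """ES_a: toy estimate from the already decoded frames that are available."""
    last = decoded[-3:]
    return sum(last) // len(last)

def decode_message(message):
    """Decode one message without any help from preceding messages."""
    frames = message["frames"]
    # First frame: use the state information carried in the message itself.
    decoded = [decode(frames[0], message["state"])]
    # Following frames: use a state estimated from what has been decoded so far.
    for v in frames[1:]:
        decoded.append(decode(v, estimate_state(decoded)))
    return decoded

result = decode_message({"state": 12, "frames": [4, 4, 4]})
print(result)
```

The first frame is decoded with the exact state carried in the message. The later frames rely on estimated states, so their decoded values may deviate from the originals to the extent that the estimate deviates from the state the encoder actually used.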

[0019] Advantageous further effects of the invention will be seen from the dependent claims and the specification.
[0020] Preferably the respective state information contained in one of the messages is used to decode only the
first frame of the message. A frame following the first frame may be decoded by a state information derived from the
first frame and/or decoded data determined by means of the first frame, e.g., the decoded first frame.

[0021] Not only a state information may be contained in the messages but also additional information, e.g., about a
type of an encoding scheme and/or about a type of the state information and/or about the respective length of the state
information and/or about the number of frames contained in the respective message.

[0022] The state information is preferentially comprised in a header preceding the first frame of a respective 
message. This header may also at least partly contain the additional information mentioned above. The additional 
information may also be comprised in an additional further header. 

[0023] A suitable embodiment of the invention provides that the state information $S_i$ is a complete state information
needed to decode at least the first frame $V_i$ of the message; 'complete state information' in the sense that the frame $V_i$ may
be decoded without loss or nearly without loss by means of the state information $S_i$.

[0024] However, it may be the case that the first frame $V_i$ (and/or one or more of the further
frames $V_{i+1}$, $V_{i+2}$, $V_{i+3}$, $V_{i+4}$) may be decoded based on the state information $S_i$ not with best but with satisfactory
quality. To obtain the best quality, the decoding (receiving) device may evaluate also at least one frame preceding the
frame $V_i$, e.g., frames $V_{i-1}$, $V_{i-2}$, if the message containing the preceding frames $V_{i-1}$, $V_{i-2}$ is available (and not lost).
[0025] If the message containing the preceding frames $V_{i-1}$, $V_{i-2}$ is available, the first frame $V_i$ may also be decoded
only by a state information that the decoding device gains from the preceding frames $V_{i-1}$, $V_{i-2}$. The decoding device may
additionally evaluate the state information $S_i$ being contained in the respective message containing the first frame $V_i$ to be
decoded.

[0026] The first frames or at least nearly each of the first frames to be encoded contain at least two samples of
digital audio and/or video data. Preferably a first frame contains a plurality of samples.

[0027] The following description will serve to explain the advantages of the invention on the basis of working 
examples as illustrated in the accompanying drawings. 

Brief Description of the Drawings: 

[0028] 

Figure 1 shows an arrangement for the performance of the method in accordance with the invention using networks
NW1, NW2, NW3.

Figure 2 shows a block diagram for a communication device CD in accordance with the invention, the
communication device CD being a communication device of the networks NW1, NW2, NW3 according to
figure 1.

Figure 3 shows a message M1 in accordance with the invention, the message M1 being for example generated by
the communication device CD according to figure 2.

Figure 4 shows very diagrammatically an encoding procedure in accordance with the invention.

Figure 5 shows very diagrammatically a decoding procedure in accordance with the invention.

Figure 6 shows a modification of the encoding procedure according to figure 4.

Figure 7 shows a modification of the decoding procedure according to figure 5.

Detailed Description of the invention: 

[0029] Reference will now be made in detail to the present preferred embodiments of the invention as illustrated in
the accompanying drawings. In describing the preferred embodiments and applications of the present invention, specific
terminology is employed for the sake of clarity. However, the invention is not intended to be limited to the specific
terminology so selected, and it is understood that each specific element includes all technical equivalents which
operate in a similar manner to accomplish a similar purpose.

[0030] Figure 1 shows a very diagrammatically presented arrangement by way of example, with which the invention
may be put into practice. Networks NW1 and NW3 are interconnected via a network NW2. The networks NW1-NW3 are
routed and/or switched networks. The network NW1 is for example an integrated services digital network (ISDN) and/or an
access network, the network NW3 a mobile radio network providing data transmission, e.g. via Wireless Application
Protocol (WAP), or a local area network (LAN), e.g., an Ethernet based LAN. The intermediate network NW2 is, e.g., the
Internet. The networks NW1-NW3 support real time communication based on a real time protocol, e.g., the RTP (Real-time
Transport Protocol) according to the definitions of the IETF (Internet Engineering Task Force).

[0031] The network NW1 comprises inter alia terminals T11, T12 and an access server ACC, the network NW3 a
terminal T31. The network NW2 comprises a router R21 and a server S21. The networks NW1-NW3 may comprise further
communication devices not shown in the drawing, e.g., further routers, terminals, hubs, switching centers and the
like. The terminals T11, T12, T31 may be for example personal computers or mobile telephone terminals.
[0032] For simplification the terminals T11, T12 [...] the [...] of similar design and only diagrammatically depicted
as block diagrams of functions. Figure 2 shows, e.g., a block diagram for a communication device CD that may be the
terminal T11, T12, T31, the router R21 or the server S21. The [...] of data, e.g., via connections C11, C12 [...] ISDN
(integrated services [...] means TR by connections, which are not illustrated. The control means CPU are for example
processors or processor arrays with which a program code of program modules may be executed, which are stored in memory
means MEM, for example program code of an application program module AP, a coding module COD, a decoding module DEC and a
message generator and evaluator module PT. The memory means MEM are for instance in the form of hard disks or RAM
modules. Furthermore the communication device CD may have display means as for example LCDs (liquid crystal
displays) and input means, for example a keyboard and/or a computer mouse. Further components which are not illustrated
are speakers and microphones for voice input and output. The communication device CD is run by an operating system as
for instance Unix, Windows NT or preferably by a real time operating system.

[0033] The communication device CD represents both a first and a second communication device according to the 
invention. To this end, the communication device CD comprises a first and a second program module (COD, DEC) 
according to the invention. In the present embodiment the first and second program modules are represented by a module 
CO, that comprises the program modules COD, DEC and PT. 

[0034] The application program module AP is for example a data base providing video and/or audio data or an audio 
and/or video player application, a karaoke player, a telephone conference application for providing a voice and/or video 
conference service or the like. 

[0035] In a first example the terminal T11 establishes a connection C11, C12 to the terminal T31 via the network
NW2, i.e., via the router R21. The terminals T11, T31 communicate audio data, e.g. voice data or music. The
communication may be interactive. The example shows a message flow DS1 containing this audio data. The terminal T11
sends the message flow DS1 to the terminal T31 using a real time protocol, e.g., the RTP (Real-time Transport Protocol)
according to the IETF.

[0036] The message flow DS1 is based on a first digital data stream IDS of audio data that is generated by the
application program AP of the terminal T11. The application program AP forwards the data stream IDS to the module COD.
A division function DIV that is presently a part of the module COD divides the digital audio data of the first data
stream IDS into first frames F1 to F5 and possibly further frames not shown. The frames F1 to F5 contain audio samples.
For example the frame F1 contains samples S1 and S2 and further samples which are not denominated in particular.
[0037] The division function DIV might also be a separate module or a part of a module separate from the module COD,
e.g., part of the application program AP. In this scenario, the module COD is equipped to only receive the first frames
F1 to F5 that contain already divided digital audio data.
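
The task of the division function DIV may be sketched as follows. The sketch assumes that the data stream is available as a list of samples and that every first frame holds a fixed number of samples, with a shorter final frame allowed; neither the function name `divide` nor the fixed frame size is prescribed by the description.

```python
def divide(samples, samples_per_frame):
    """Split a sample stream into first frames of samples_per_frame samples each."""
    if samples_per_frame < 2:
        # The description expects first frames to hold at least two samples.
        raise ValueError("a first frame should contain at least two samples")
    return [
        samples[i:i + samples_per_frame]
        for i in range(0, len(samples), samples_per_frame)
    ]

first_frames = divide(list(range(10)), samples_per_frame=4)
print(first_frames)
```

The last frame may be shorter than the others when the stream length is not a multiple of the frame size, which matches the wording that at least nearly each of the first frames contains at least two samples.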

[0038] The module COD encodes the first frames F1 to F5 and possibly further first frames not shown into second
frames V1P-V3P; V1 to V20 and possibly further second frames not shown. The module COD performs frame-based
encoding of the data stream IDS contained in the first frames F1 to F5. The second frames V1P-V3P; V1 to V20 contain
compressed data of the data stream IDS.

[0039] The module COD encodes the first frames F1 to F3 into the second frames V1P-V3P and the first frames F4,
F5 as well as further first frames not shown into the second frames V1 to V20 and further second frames not shown.

[0040] The module COD forwards the second frames V1P-V3P to the message generator and evaluator module PT.
The module PT packs the frames V1P-V3P into a message M1P that is not shown in detail. The module PT forwards the
message M1P to the connecting means TR. The connecting means TR in turn transmit the message M1P via the
connection C11, C12 to the terminal T31. The message M1P may be the first message of the message flow DS1.
[0041] The encoding of the frames F4 and F5 is explained in more detail. The module COD encodes the frame F4 into
the frame V1. To this end, the module COD evaluates the preceding second frames V1P-V3P, thereby determining a
state information SI1. The module COD uses the state information SI1 to encode the frame F4 into the frame V1.
[0042] Accordingly, the module COD encodes the frame F5 into the frame V2 based on a state information SI2 that is
derived from frames V2P, V3P and V1. The further second frames V3 to V20 may be generated accordingly.
[0043] It has to be noted that the state information SI1, SI2 and further state information not shown may also be
derived from only one preceding frame or from more than two or three preceding frames. The state information SI1 might
for example be derived from the frame V3P or from a part of frame V3P.

[0044] It is furthermore not mandatory that one first frame F1-F5 is encoded in one second frame V1P-V2. It may
also be the case that a second frame is based on more than one first frame or that one first frame is encoded into more
than one second frame.

[0045] The module COD provides the second frames V1 to V20 and the state information SI1 for the module PT. The
module PT generates a message M1 comprising the state information SI1 and the second frames V1 to V20. The state
information SI1 will be needed to decode the first frame V1 of the second frames V1 to V20. The module PT forwards the
message M1 to the connecting means TR, which transmit the message M1 via the connection C11, C12 to the terminal T31.
The message M1 is the second message of the message flow DS1. Accordingly, the terminal T11 may send further
messages within the message flow DS1.

[0046] It has to be noted that in the present embodiment the message M1 contains a state information in contrast to
the message M1P, which is the first message of the message flow DS1 and which contains no state information, because
there is no frame preceding the frame V1P available. It may however be the case that also the first message of a
message flow contains a state information. For example the message M1P may contain a copy of the frame V1P as a state
information.

[0047] The message M1 is an RTP message (RTP = Real-time Transport Protocol) with an RTP header RPH and an RTP
payload RPL. The payload RPL contains a header HD and a frame payload FX with the second frames V1 to V20. The
header HD comprises [...]

[0048] The additional information [...] frame payload FX. The additional information AI presently comprises a type of
[...] scheme ES indicates the [...] encoding used by the module COD. Furthermore, the additional information AI contains
a type information TY about the type of the state information SI1 and a length information LI indicating the length of
the state information SI1. Thus, a state information may have a variable length. The additional information AI also
comprises a number information NR about the number of the frames contained in the frame payload FX. The
number information NR of the messages M1P, M1 is for example 3 and 20 respectively.
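
A possible byte layout for the payload RPL may be sketched as follows. The description names the fields ES, TY, LI and NR, the state information SI1 and the frame payload FX, but it does not fix field widths or ordering. The layout below (one byte each for ES, TY and NR, two bytes for LI, network byte order, fixed-size frames) is therefore purely an assumption, and `pack_payload` and `unpack_payload` are hypothetical helpers.

```python
import struct

# Assumed layout of the additional information: ES, TY, LI, NR (network byte order).
HEADER = struct.Struct("!BBHB")

def pack_payload(es, ty, state_info, frames):
    """Build header HD (additional information plus state information) and frame payload FX."""
    header = HEADER.pack(es, ty, len(state_info), len(frames))
    return header + state_info + b"".join(frames)

def unpack_payload(data, frame_size):
    """Split a payload into ES, TY, the state information and the encoded frames."""
    es, ty, li, nr = HEADER.unpack_from(data, 0)
    start = HEADER.size
    state_info = data[start:start + li]  # LI allows a variable-length state information
    body = data[start + li:]
    frames = [body[i * frame_size:(i + 1) * frame_size] for i in range(nr)]
    return es, ty, state_info, frames

payload = pack_payload(1, 2, b"\x0a\x0b\x0c", [b"AAAA", b"BBBB"])
print(len(payload), unpack_payload(payload, frame_size=4))
```

Because the length information LI precedes the state information, a receiver can skip or read a state information of any length before it reaches the first encoded frame.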

[0049] The terminal T31 decodes the frames V1P-V3P; V1-V20 contained in the messages M1P, M1 respectively. To
this end the terminal T31 is equipped with the decoder/decoding module DEC that is a second program module according
to the invention. The decoding module DEC is in the present embodiment a part of the module CO that contains also the
modules COD and PT. It has however to be noted that the modules COD and DEC may also be separate modules. In
cooperation with the decoding module DEC the module PT serves as a message evaluator module. The module PT might
also be a component of the module DEC or a module separate from the module CO.

[0050] The terminal T31 receives the messages M1P, M1 and possibly further messages of the message flow DS1
with the aid of the connecting means TR. The connecting means TR convey the messages M1P, M1 to the module PT that
evaluates the messages M1P, M1. The module PT forwards for example the frames V1P-V3P; V1-V20 frame-by-frame
to the module DEC.

[0051] The module DEC decodes/decompresses the frames V1P-V3P; V1-V20 of the messages M1P, M1, thereby
generating decoded frames DF1-DF5 and further frames not shown. The decoded frames DF1-DF5 are very similar to
the original first frames F1-F5. But since coding/decoding typically causes certain loss, the module DEC cannot
recover the original first frames F1-F5 exactly. The module DEC forwards the data of the decoded frames DF1-DF5 as a data
stream DDS to the application program module AP that is for example in the case of the terminal T31 an audio player.
[0052] The decoding operation on the frames V1P-V3P is not explained in detail. Instead, the decoding operation on
the second frames V1-V2 is explained. Provided the message M1P is available, the module DEC may decode the frame
V1 based on the frames V1P-V3P, e.g. based on a state information ESI1 derived from the frames V1P-V3P.

[0053] If however the message M1P is lost, e.g., due to a transmission error, the frames V1P-V3P are not
available. In such a scenario the decoded frames DF1-DF3 may not be recovered due to the missing frames V1P-V3P.
The frames DF4, DF5 may however be recovered. To this end, the module DEC evaluates the state information SI1 to
decode/decompress the frame V1, thereby generating the frame DF4.

[0054] Subsequently, the module DEC decodes the frame V2, thereby generating the frame DF5. To decode the frame
V2 the module DEC evaluates the frame V1. Additionally, the module DEC may also evaluate the state information SI1.
The evaluation of the frame V1 and/or the state information SI1 produces an estimated state information ESI2 that is
similar to the state information SI2. The estimated state information ESI2 is only similar because the frames V2P, V3P
are in the present example not available to recover the original state information SI2.

[0055] If however only one preceding frame or only a part of a preceding frame is needed to recover a state
information, not only the estimated state information ESI2 but the original state information SI2 could be recovered. In
a modification of the example of figure 5, only the frame V1 (and not the frames V2P, V3P) could be for example
sufficient to recover the original state information SI2.
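
The choice of the state source described for the messages M1P and M1 may be sketched as follows: when the preceding message is available, the state for the first frame is derived from its frames; when it is lost, the state information carried in the current message is used instead. The message representation as a dictionary, the integer frames and the mean-based derivation in `derive_state` are assumptions made only for this illustration.

```python
def derive_state(frames):
    """Hypothetical state derivation: mean of up to three preceding encoded frames."""
    last = frames[-3:]
    return sum(last) // len(last)

def state_for_first_frame(previous_message, current_message):
    """Return (state, source) used to decode the first frame of current_message."""
    if previous_message is not None:
        # Preceding message received: derive the state from its frames.
        return derive_state(previous_message["frames"]), "derived"
    if current_message.get("state") is not None:
        # Preceding message lost: fall back to the carried state information.
        return current_message["state"], "carried"
    raise ValueError("no state available for the first frame")

m1p = {"state": None, "frames": [10, 12, 14]}  # first message, no state information
m1 = {"state": 12, "frames": [4, 4, 4]}        # second message, carries a state

print(state_for_first_frame(m1p, m1))   # preceding message received
print(state_for_first_frame(None, m1))  # preceding message lost
```

In this toy setting both paths yield the same state value, which corresponds to the case in which the carried state information is a complete state information.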

[0056] Figure 6 shows a modification of the encoding example of figure 4. In this embodiment a state information
SI12 is generated based on the first frames F1 to F3 and a state information SI22 is generated based on the first frames
F2 to F4. The state information SI12, SI22 correlate to the state information SI1, SI2 respectively. The state information
SI12 may be packed into a message M12 similar to the message M1 that comprises the state information SI1.
[0057] Accordingly, the frame V1 may be decoded using the state information SI12 as shown in figure 7.
Subsequently, for decoding the frame V2, the module DEC determines an estimated state information ESI22 based on the
state information SI12 and/or the decoded frame DF4 that is derived from the encoded frame V1. The state information
ESI22 may also be based on the frame V1.

[0058] The encoding of frames and the generation of RTP messages containing a state information according to the
invention may be performed by various devices.
[0059] Instead of the terminal T11, for example the router R21 or any other device not shown may provide the
encoding of frames and/or the packing of the encoded frames into RTP messages that also contain a state information
according to the invention. If for example the terminal T11 were not capable of the inventive encoding
operation explained above, the router R21 could serve as an encoder according to the invention.

[0060] Also the decoding operation may be performed by the router R21 or any other device not shown. If for example
the terminal T31 were not capable of the inventive decoding operation explained above, the router R21 could
serve as a decoder according to the invention.

[0061] In a further embodiment of the invention the terminal T11 establishes the connection to the terminal T31 via
the access server ACC. The access server ACC provides access to the network NW2 for communication devices of the
network NW1. Furthermore, [...] services for, e.g., the terminal T11.

[0062] The terminal T11 sends for example a digital data stream DS11 to the access server ACC, which in turn divides
the data stream DS11 into first frames, encodes the first frames into second frames, and packs the second frames into RTP
messages additionally containing a state information as explained above. The access server ACC forwards the RTP
messages in a flow [...] transmits the RTP messages to the terminal T31.

[0063] In the opposite direction the access server ACC may decode messages sent by the terminal T31 to the terminal
T11. The messages may also contain a state information needed to decode at least the first encoded
frame of a respective message as explained above.

[0064] The server S21 is a video server providing real time [...]
data files which the server S21 encodes case-by-case. The server S21 divides the digital video data into first frames,
encodes the first frames into [...] state information. Subsequently, the server S21 forwards the RTP messages within a
message flow, e.g., a message flow DS2, to a device requesting the video data, e.g. to the terminal T12. The terminal
T12 extracts the second frames and the state information from the messages and decodes the second frames as explained
above.

[0065] The server S21 may also store video data as already encoded second frames in combination with state
information in a data base, e.g., represented by the module AP. In such a scenario there is no encoding operation
necessary. The server S21 just packs the encoded second frames together with an additional state information as
explained above into RTP messages.

[0066] Although the foregoing invention has been described in some detail for purposes of clarity of understanding, 
it will be apparent that certain changes and modifications may be practiced within the scope of the invention. 
[0067] For example a message according to the invention may carry 'difference frames' instead of encoded second 
frames. Each difference frame represents the difference value of two subsequent second frames. In such a scenario the 
state information contained in a respective message is, e.g., the original encoded second frame that is used to
determine the first difference frame of the respective message. 
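
The difference-frame variant may be sketched as follows. The sketch assumes that encoded second frames can be represented as integers so that a difference value is a simple subtraction, and that the state information is the encoded frame preceding the first frame of the message; `to_difference_message` and `from_difference_message` are hypothetical names.

```python
from itertools import accumulate

def to_difference_message(prev_encoded, encoded_frames):
    """Replace encoded frames by differences of subsequent frames.

    prev_encoded is the encoded frame used to form the first difference;
    it is carried in the message as state information.
    """
    diffs, prev = [], prev_encoded
    for v in encoded_frames:
        diffs.append(v - prev)
        prev = v
    return {"state": prev_encoded, "diffs": diffs}

def from_difference_message(message):
    """Rebuild the encoded frames from the carried state and the differences."""
    running = accumulate(message["diffs"], initial=message["state"])
    return list(running)[1:]  # drop the state itself, keep the rebuilt frames

msg = to_difference_message(7, [9, 12, 11])
print(msg, from_difference_message(msg))
```

Since the message carries the frame against which its first difference was formed, the receiver can rebuild all frames of the message even if the preceding message is lost.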

Claims 

1. A method for transmitting digital audio and/or video data, the method comprising the steps of: 


dividing of digital audio and/or video data (S1, S2) of a first data stream (IDS) into first frames (F1-F5),

frame-based encoding of data of said first frames (F1-F5) into second frames (V1P-V3P; V1-V20), said second 
frames (V1 P-V3P; V1-V20) containing compressed data of said first data stream (IDS), 

generating a flow (DS1; DS12; DS2) of messages (M1P, M1) of a real time protocol, a majority of said messages 
(M1P, M1) containing at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and a state 
information (SI1) needed to decode said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; 
V1-V20), said state information (SI1) being derivable from at least one second frame (V1P-V3P) of said second
frames (V1P-V3P; V1-V20) provided that said at least one second frame (V1P-V3P) of said second frames (V1P-V3P;
V1-V20) is available, said at least one second frame (V1P-V3P) of said second frames (V1P-V3P; V1-V20)
preceding said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and being
contained in a respective message (M1P) preceding a respective message (M1) containing said at least one first
frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and said state information (SI1),

transmitting said messages (M1P, M1) from a first communication device (CD; T11; S21; ACC) to a second
communication device (CD; T31; T12), and

decoding said second frames (V1P-V3P; V1-V20) contained in said messages (M1P, M1), thereby evaluating said
respective state information (SI1) contained in a respective message (M1) to decompress said respective at
least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) being contained in said respective
message (M1).


2. The method as claimed in claim 1, characterized in that a respective state information (SI1) contained in one of
said messages (M1P, M1) is used to decode said first frame (V1) of said message (M1P, M1), and in that a further
frame (V2) following said first frame (V1) and being also contained in said message (M1P, M1) is decoded by a state
information (ESI2) derived from said first frame (V1) and/or from said state information (SI1) of said message (M1)
and/or from a decoded frame (DF4) being derived from said first frame (V1).

3. The method as claimed in claim 1, characterized in that said second communication device (CD, T31, T12)
determines a state information (ESI1) needed to decompress a respective at least one first frame (V1, V2-V20) of
said second frames (V1P-V3P; V1-V20) being contained in a respective message (M1) by means of at least one
second frame (V1P-V3P) of said second frames (V1P-V3P; V1-V20) being contained in a message (M1P) preceding
said respective message (M1) if said preceding message (M1P) is available, and in that said second communication
device (CD, T31, T12) decompresses said at least one first frame (V1) of said second frames (V1P-V3P; V1-V20) by
means of said self-determined state information (ESI1) and/or by means of said state information (SI1) being
contained in said respective message (M1).

4. The method as claimed in claim 1, characterized in that said messages (M1P, M1) further contain additional
information (AI) about a type of an encoding scheme (ES) and/or about a type (TY) of said state information (SI1)
and/or about the respective length (LI) of said state information (SI1) and/or about the number of frames (NR)
contained in the respective message.

5. The method as claimed in claim 1 and/or claim 4, characterized in that said additional information (AI) and/or said
state information (SI1) [...] respective message

6. The method as claimed in claim 1, characterized in that each of said first frames (F1-F5) or at least nearly each
of said [...]

7. A first communication device for transmitting digital audio and/or video data (S1, S2), said first communication
device (CD, TR11, ACC, S21) comprising means (CPU, MEM, TR, CO) for carrying out the steps of:

frame-based encoding of data of first frames (F1-F5) into second frames (V1P-V3P; V1-V20), said first frames
(F1-F5) containing digital audio and/or video data (S1, S2) of a first data stream (IDS) and said second frames
(V1P-V3P; V1-V20) containing compressed data of said first data stream (IDS),

generating a flow (DS1; DS12; DS2) of messages (M1P, M1) of a real time protocol, a majority of said messages
(M1P, M1) containing at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and a state
information (SI1) needed to decode said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P;
V1-V20), said state information (SI1) being derivable from at least one second frame (V1P-V3P) of said second
frames (V1P-V3P; V1-V20) provided that said at least one second frame (V1P-V3P) of said second frames (V1P-V3P;
V1-V20) is available, said at least one second frame (V1P-V3P) of said second frames (V1P-V3P; V1-V20)
preceding said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and being
contained in a respective message (M1P, M1) preceding a respective message (M1P, M1) containing said at least
one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and said state information (SI1), and

transmitting said messages (M1P, M1) to a second communication device (CD; T31; T12).


8. A first communication device as claimed in claim 7, characterized in that it comprises means (DIV) for dividing of
said digital audio and/or video data (S1, S2) of said first data stream (IDS) into said first frames (F1-F5).

9. A second communication device for receiving digital audio and/or video data (S1, S2), said second communication
device (CD; T31; T12) being designed to cooperate with a first communication device (CD; T11; S21; ACC) according
to claim 7 or 8, said second communication device (CD; T31; T12) comprising means (CPU, MEM, TR, CO) for carrying
out the steps of:

receiving messages (M1P, M1) sent by a first communication device (CD; T11; S21; ACC), and

decoding said second frames (V1P-V3P; V1-V20) contained in said messages (M1P, M1), thereby evaluating said
respective state information (SI1) contained in a respective message (M1P, M1) to decompress said respective at
least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) being contained in said respective
message (M1P, M1).

10. A first program module for a first communication device (CD; T11; S21; ACC), said first program module (CO; COD [...])
containing program code able to be executed by a control means (CPU) of said first communication device (CD;
T11; S21; ACC) and said first program module making said first communication device (CD; T11; S21; ACC) carrying
out the steps of:

frame-based encoding of data of first frames (F1-F5) into second frames (V1P-V3P; V1-V20), said first frames
(F1-F5) containing digital audio and/or video data (S1, S2) of a first data stream (IDS) and said second frames
(V1P-V3P; V1-V20) containing compressed data of said first data stream (IDS),

generating a flow (DS1; DS12; DS2) of messages (M1P, M1) of a real time protocol, a majority of said messages
(M1P, M1) containing at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and a state
information (SI1) needed to decode said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P;
V1-V20), said state information (SI1) being derivable from at least one second frame (V1P-V3P)
of said second frames (V1P-V3P; V1-V20) provided that said at least one second frame (V1P-V3P) of said
second frames (V1P-V3P; V1-V20) is available, said at least one second frame (V1P-V3P) of said second frames
(V1P-V3P; V1-V20) preceding said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20)
and being contained in a respective message (M1P, M1) preceding a respective message (M1P, M1) containing
said at least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) and said state information
(SI1), and

providing said messages (M1P, M1) for the transmission to a second communication device (CD; T31; T12).

11. A second program module for a second communication device (CD; T31; T12), said second communication device
(CD; T31; T12) being designed to cooperate with a first communication device (CD; T11; S21; ACC) according to claim
[...], said second program module (CO; DEC, PI) containing program code able to be executed by a control means (CPU)
of said second communication device (CD; T31; T12) and said second program module making said second communication
device (CD; T31; T12) carrying out the steps of:

receiving messages (M1P, M1) sent by said first communication device (CD; T11; S21; ACC), and

decoding said second frames (V1P-V3P; V1-V20) contained in said messages (M1P, M1), thereby evaluating said
respective state information (SI1) contained in a respective message (M1P, M1) to decompress said respective at
least one first frame (V1, V2-V20) of said second frames (V1P-V3P; V1-V20) being contained in said respective
message (M1P, M1).

12. A program storage device, in particular a computer diskette, a digital versatile disc or a hard disk, having a 
first program module as claimed in claim 10 recorded thereon and/or having a second program module as claimed in 
claim 11 recorded thereon.